Overview

Dataset statistics

Number of variables: 20
Number of observations: 239677
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 38.4 MiB
Average record size in memory: 168.0 B
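
The average record size follows from the two figures reported with it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 MiB = 1024 × 1024 bytes; the function name is hypothetical and is not part of pandas-profiling.

```python
MIB = 1024 * 1024  # bytes per mebibyte (assumed unit of the reported total)


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * MIB / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(38.4, 239677)
print(f"{avg_bytes:.1f} B")
```

The result rounds to the 168.0 B shown in the report.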

Variable types

Numeric: 9
DateTime: 1
Categorical: 10

Alerts

state has a high cardinality: 51 distinct values High cardinality
city_or_county has a high cardinality: 12898 distinct values High cardinality
address has a high cardinality: 198037 distinct values High cardinality
gun_type has a high cardinality: 2502 distinct values High cardinality
participant_age has a high cardinality: 18951 distinct values High cardinality
participant_age_group has a high cardinality: 898 distinct values High cardinality
participant_gender has a high cardinality: 873 distinct values High cardinality
participant_relationship has a high cardinality: 284 distinct values High cardinality
participant_status has a high cardinality: 2150 distinct values High cardinality
participant_type has a high cardinality: 259 distinct values High cardinality
incident_id is highly correlated with n_guns_involved High correlation
n_guns_involved is highly correlated with incident_id High correlation
state is highly correlated with congressional_district and 4 other fields High correlation
n_killed is highly correlated with n_injured High correlation
n_injured is highly correlated with n_killed High correlation
congressional_district is highly correlated with state High correlation
latitude is highly correlated with state and 1 other fields High correlation
longitude is highly correlated with state and 1 other fields High correlation
state_house_district is highly correlated with state High correlation
state_senate_district is highly correlated with state High correlation
n_killed is highly skewed (γ1 = 55.55636018) Skewed
n_guns_involved is highly skewed (γ1 = 67.45361155) Skewed
incident_id has unique values Unique
latitude has 7923 (3.3%) zeros Zeros
longitude has 7923 (3.3%) zeros Zeros
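
The skewness alerts report γ1 for n_killed and n_guns_involved. As a rough illustration of what such a figure measures, the sketch below computes the bias-adjusted Fisher-Pearson sample skewness in plain Python. It assumes, without having checked the profiler's source, that the report uses the same adjusted estimator that pandas exposes as `Series.skew()`; the function name and the sample data are hypothetical.

```python
from math import sqrt


def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 of a sample (needs n >= 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5  # unadjusted (population) skewness
    return g1 * sqrt(n * (n - 1)) / (n - 2)  # small-sample adjustment


# Toy data: most values are small and one is large, as with n_killed.
sample = [1, 1, 1, 1, 10]
skew = adjusted_skewness(sample)
print(skew)
```

A single large value among many small ones pulls the statistic strongly positive, which is the pattern the two alerts flag.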

Reproduction

Analysis started: 2022-10-13 12:44:22.885844
Analysis finished: 2022-10-13 12:45:21.400613
Duration: 58.51 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
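
The reported duration can be recovered from the two timestamps. A small sketch using only the standard library; the timestamp format string is an assumption based on how the values are printed above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps

started = datetime.strptime("2022-10-13 12:44:22.885844", FMT)
finished = datetime.strptime("2022-10-13 12:45:21.400613", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```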

Variables

incident_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 239677
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 559334.3464
Minimum: 92114
Maximum: 1083472
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 92114
5-th percentile: 126858.6
Q1: 308545
median: 543587
Q3: 817228
95-th percentile: 1028801.6
Maximum: 1083472
Range: 991358
Interquartile range (IQR): 508683

Descriptive statistics

Standard deviation: 293128.6843
Coefficient of variation (CV): 0.5240670203
Kurtosis: -1.22417132
Mean: 559334.3464
Median Absolute Deviation (MAD): 253242
Skewness: 0.09516790421
Sum: 1.340595781 × 10^11
Variance: 8.592442555 × 10^10
Monotonicity: Not monotonic
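
Several of the figures for incident_id are derived from others in the same block. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. It is only a consistency check on the printed numbers, not the profiler's own code; the variable names are hypothetical.

```python
# Values copied from the incident_id statistics in this report.
minimum, maximum = 92114, 1083472
q1, q3 = 308545, 817228
mean, std = 559334.3464, 293128.6843

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```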
Histogram with fixed size bins (bins=50)
Value                    Count    Frequency (%)
461105                   1        < 0.1%
717982                   1        < 0.1%
718485                   1        < 0.1%
718520                   1        < 0.1%
719602                   1        < 0.1%
718791                   1        < 0.1%
717783                   1        < 0.1%
717393                   1        < 0.1%
717685                   1        < 0.1%
719638                   1        < 0.1%
Other values (239667)    239667   > 99.9%

Value      Count    Frequency (%)
92114      1        < 0.1%
92117      1        < 0.1%
92119      1        < 0.1%
92122      1        < 0.1%
92125      1        < 0.1%
92129      1        < 0.1%
92131      1        < 0.1%
92133      1        < 0.1%
92135      1        < 0.1%
92137      1        < 0.1%

Value      Count    Frequency (%)
1083472    1        < 0.1%
1083466    1        < 0.1%
1083457    1        < 0.1%
1083435    1        < 0.1%
1083428    1        < 0.1%
1083413    1        < 0.1%
1083396    1        < 0.1%
1083390    1        < 0.1%
1083389    1        < 0.1%
1083379    1        < 0.1%

date
Date

Distinct: 1725
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2018-12-03 00:00:00
Histogram with fixed size bins (bins=50)

state
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
Illinois: 17556
California: 16306
Florida: 15029
Texas: 13577
Ohio: 10244
Other values (46): 166965

Length

Max length: 20
Median length: 13
Mean length: 8.664310718
Min length: 4

Characters and Unicode

Total characters: 2076636
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
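
As a small illustration of the character properties mentioned here, the sketch below counts the Unicode general categories in one value from the Sample list, using Python's standard `unicodedata` module. The function name is hypothetical, and this is not claimed to be how the profiler computes its tables. `unicodedata.category` returns two-letter codes: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# "North Carolina" appears in the state sample rows.
counts = category_counts("North Carolina")
print(dict(counts))
```

Summing such counts over every value in a column would give totals of the kind listed under "Most occurring categories".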

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Pennsylvania
2nd row: California
3rd row: Ohio
4th row: Colorado
5th row: North Carolina

Common Values

Value                Count     Frequency (%)
Illinois             17556     7.3%
California           16306     6.8%
Florida              15029     6.3%
Texas                13577     5.7%
Ohio                 10244     4.3%
New York             9712      4.1%
Pennsylvania         8929      3.7%
Georgia              8925      3.7%
North Carolina       8739      3.6%
Louisiana            8103      3.4%
Other values (41)    122557    51.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
new17708
 
6.3%
illinois17556
 
6.2%
california16306
 
5.8%
carolina15678
 
5.5%
florida15029
 
5.3%
texas13577
 
4.8%
ohio10244
 
3.6%
york9712
 
3.4%
north9312
 
3.3%
pennsylvania8929
 
3.2%
Other values (45)148989
52.6%

Most occurring characters

ValueCountFrequency (%)
a256726
12.4%
i246073
11.8%
n176038
 
8.5%
o171176
 
8.2%
s155405
 
7.5%
e121747
 
5.9%
r118926
 
5.7%
l116103
 
5.6%
t56154
 
2.7%
h49637
 
2.4%
Other values (36)608651
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1753428
84.4%
Uppercase Letter279845
 
13.5%
Space Separator43363
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a256726
14.6%
i246073
14.0%
n176038
10.0%
o171176
9.8%
s155405
8.9%
e121747
6.9%
r118926
6.8%
l116103
6.6%
t56154
 
3.2%
h49637
 
2.8%
Other values (14)285443
16.3%
Uppercase Letter
ValueCountFrequency (%)
C41447
14.8%
M33743
12.1%
N30623
10.9%
I27481
9.8%
T21203
 
7.6%
O15985
 
5.7%
F15029
 
5.4%
A11990
 
4.3%
W10290
 
3.7%
Y9712
 
3.5%
Other values (11)62342
22.3%
Space Separator
ValueCountFrequency (%)
43363
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2033273
97.9%
Common43363
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a256726
12.6%
i246073
12.1%
n176038
 
8.7%
o171176
 
8.4%
s155405
 
7.6%
e121747
 
6.0%
r118926
 
5.8%
l116103
 
5.7%
t56154
 
2.8%
h49637
 
2.4%
Other values (35)565288
27.8%
Common
ValueCountFrequency (%)
43363
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2076636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a256726
12.4%
i246073
11.8%
n176038
 
8.5%
o171176
 
8.2%
s155405
 
7.5%
e121747
 
5.9%
r118926
 
5.7%
l116103
 
5.6%
t56154
 
2.7%
h49637
 
2.4%
Other values (36)608651
29.3%

city_or_county
Categorical

HIGH CARDINALITY

Distinct: 12898
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
Chicago: 10814
Baltimore: 3943
Washington: 3279
New Orleans: 3071
Philadelphia: 2963
Other values (12893): 215607

Length

Max length: 46
Median length: 43
Mean length: 9.355365763
Min length: 3

Characters and Unicode

Total characters: 2242266
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5031
Unique (%): 2.1%

Sample

1st row: Mckeesport
2nd row: Hawthorne
3rd row: Lorain
4th row: Aurora
5th row: Greensboro

Common Values

Value                   Count     Frequency (%)
Chicago                 10814     4.5%
Baltimore               3943      1.6%
Washington              3279      1.4%
New Orleans             3071      1.3%
Philadelphia            2963      1.2%
Saint Louis             2501      1.0%
Houston                 2501      1.0%
Milwaukee               2487      1.0%
Jacksonville            2448      1.0%
Memphis                 2386      1.0%
Other values (12888)    203284    84.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
chicago11796
 
3.8%
city6413
 
2.1%
county6348
 
2.0%
new5148
 
1.7%
saint4315
 
1.4%
baltimore3966
 
1.3%
washington3401
 
1.1%
san3353
 
1.1%
orleans3157
 
1.0%
beach2994
 
1.0%
Other values (8671)258802
83.6%

Most occurring characters

ValueCountFrequency (%)
a200004
 
8.9%
o181617
 
8.1%
e181294
 
8.1%
n168763
 
7.5%
i152407
 
6.8%
l137985
 
6.2%
t123954
 
5.5%
r117309
 
5.2%
s98636
 
4.4%
70150
 
3.1%
Other values (49)810147
36.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1842443
82.2%
Uppercase Letter306110
 
13.7%
Space Separator70150
 
3.1%
Close Punctuation11462
 
0.5%
Open Punctuation11462
 
0.5%
Dash Punctuation512
 
< 0.1%
Other Punctuation127
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a200004
10.9%
o181617
9.9%
e181294
9.8%
n168763
9.2%
i152407
 
8.3%
l137985
 
7.5%
t123954
 
6.7%
r117309
 
6.4%
s98636
 
5.4%
h62169
 
3.4%
Other values (17)418305
22.7%
Uppercase Letter
ValueCountFrequency (%)
C44556
14.6%
S27436
 
9.0%
B26681
 
8.7%
M21634
 
7.1%
L20173
 
6.6%
P19538
 
6.4%
A15128
 
4.9%
W14652
 
4.8%
H14198
 
4.6%
R12615
 
4.1%
Other values (16)89499
29.2%
Other Punctuation
ValueCountFrequency (%)
'74
58.3%
.53
41.7%
Space Separator
ValueCountFrequency (%)
70150
100.0%
Close Punctuation
ValueCountFrequency (%)
)11462
100.0%
Open Punctuation
ValueCountFrequency (%)
(11462
100.0%
Dash Punctuation
ValueCountFrequency (%)
-512
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2148553
95.8%
Common93713
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a200004
 
9.3%
o181617
 
8.5%
e181294
 
8.4%
n168763
 
7.9%
i152407
 
7.1%
l137985
 
6.4%
t123954
 
5.8%
r117309
 
5.5%
s98636
 
4.6%
h62169
 
2.9%
Other values (43)724415
33.7%
Common
ValueCountFrequency (%)
70150
74.9%
)11462
 
12.2%
(11462
 
12.2%
-512
 
0.5%
'74
 
0.1%
.53
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2242261
> 99.9%
None5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a200004
 
8.9%
o181617
 
8.1%
e181294
 
8.1%
n168763
 
7.5%
i152407
 
6.8%
l137985
 
6.2%
t123954
 
5.5%
r117309
 
5.2%
s98636
 
4.4%
70150
 
3.1%
Other values (48)810142
36.1%
None
ValueCountFrequency (%)
ñ5
100.0%

address
Categorical

HIGH CARDINALITY

Distinct: 198037
Distinct (%): 82.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
2375 International Pkwy: 16657
6000 N Terminal Pkwy: 141
Main Street: 131
3400 E Sky Harbor Blvd: 127
8500 Peña Blvd: 99
Other values (198032): 222522

Length

Max length: 97
Median length: 75
Mean length: 22.50227181
Min length: 2

Characters and Unicode

Total characters: 5393277
Distinct characters: 93
Distinct categories: 15
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 184863
Unique (%): 77.1%

Sample

1st row: 1506 Versailles Avenue and Coursin Street
2nd row: 13500 block of Cerise Avenue
3rd row: 1776 East 28th Street
4th row: 16000 block of East Ithaca Place
5th row: 307 Mourning Dove Terrace

Common Values

Value                          Count     Frequency (%)
2375 International Pkwy        16657     6.9%
6000 N Terminal Pkwy           141       0.1%
Main Street                    131       0.1%
3400 E Sky Harbor Blvd         127       0.1%
8500 Peña Blvd                 99        < 0.1%
2800 N Terminal Rd             86        < 0.1%
6000 North Terminal Parkway    80        < 0.1%
1 Terminal Dr                  62        < 0.1%
1 Jeff Fuqua Blvd              61        < 0.1%
3200 E Airfield Dr             57        < 0.1%
Other values (198027)          222176    92.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
block83046
 
8.3%
of82994
 
8.3%
street47009
 
4.7%
and34826
 
3.5%
avenue32127
 
3.2%
st31940
 
3.2%
ave21322
 
2.1%
road20226
 
2.0%
pkwy17627
 
1.8%
international17395
 
1.7%
Other values (39852)616856
61.4%

Most occurring characters

ValueCountFrequency (%)
771361
 
14.3%
e406615
 
7.5%
o350182
 
6.5%
t315058
 
5.8%
a261923
 
4.9%
n256455
 
4.8%
r240715
 
4.5%
0234554
 
4.3%
l215068
 
4.0%
i137799
 
2.6%
Other values (83)2203547
40.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3286464
60.9%
Space Separator771363
 
14.3%
Decimal Number690325
 
12.8%
Uppercase Letter622518
 
11.5%
Other Punctuation19718
 
0.4%
Dash Punctuation2721
 
0.1%
Final Punctuation106
 
< 0.1%
Open Punctuation21
 
< 0.1%
Close Punctuation20
 
< 0.1%
Other Number12
 
< 0.1%
Other values (5)9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e406615
12.4%
o350182
10.7%
t315058
 
9.6%
a261923
 
8.0%
n256455
 
7.8%
r240715
 
7.3%
l215068
 
6.5%
i137799
 
4.2%
d133862
 
4.1%
k125391
 
3.8%
Other values (19)843396
25.7%
Uppercase Letter
ValueCountFrequency (%)
S125002
20.1%
A67805
10.9%
R45933
 
7.4%
P39020
 
6.3%
W36913
 
5.9%
C34134
 
5.5%
B32824
 
5.3%
D30559
 
4.9%
N29319
 
4.7%
E27076
 
4.3%
Other values (16)153933
24.7%
Decimal Number
ValueCountFrequency (%)
0234554
34.0%
189284
 
12.9%
275558
 
10.9%
362760
 
9.1%
558332
 
8.4%
746499
 
6.7%
440154
 
5.8%
631171
 
4.5%
826838
 
3.9%
925175
 
3.6%
Other Punctuation
ValueCountFrequency (%)
.18286
92.7%
,582
 
3.0%
'314
 
1.6%
/281
 
1.4%
#124
 
0.6%
&106
 
0.5%
;18
 
0.1%
"6
 
< 0.1%
\1
 
< 0.1%
Format
ValueCountFrequency (%)
2
40.0%
1
20.0%
­1
20.0%
1
20.0%
Space Separator
ValueCountFrequency (%)
771361
> 99.9%
 2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-2720
> 99.9%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(20
95.2%
[1
 
4.8%
Close Punctuation
ValueCountFrequency (%)
)19
95.0%
]1
 
5.0%
Other Number
ValueCountFrequency (%)
½11
91.7%
¼1
 
8.3%
Final Punctuation
ValueCountFrequency (%)
106
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ1
100.0%
Line Separator
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
`1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3908982
72.5%
Common1484295
 
27.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e406615
 
10.4%
o350182
 
9.0%
t315058
 
8.1%
a261923
 
6.7%
n256455
 
6.6%
r240715
 
6.2%
l215068
 
5.5%
i137799
 
3.5%
d133862
 
3.4%
k125391
 
3.2%
Other values (45)1465914
37.5%
Common
ValueCountFrequency (%)
771361
52.0%
0234554
 
15.8%
189284
 
6.0%
275558
 
5.1%
362760
 
4.2%
558332
 
3.9%
746499
 
3.1%
440154
 
2.7%
631171
 
2.1%
826838
 
1.8%
Other values (28)47784
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII5393020
> 99.9%
None145
 
< 0.1%
Punctuation111
 
< 0.1%
Modifier Letters1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
771361
 
14.3%
e406615
 
7.5%
o350182
 
6.5%
t315058
 
5.8%
a261923
 
4.9%
n256455
 
4.8%
r240715
 
4.5%
0234554
 
4.3%
l215068
 
4.0%
i137799
 
2.6%
Other values (68)2203290
40.9%
None
ValueCountFrequency (%)
ñ126
86.9%
½11
 
7.6%
 2
 
1.4%
2
 
1.4%
é1
 
0.7%
¼1
 
0.7%
­1
 
0.7%
í1
 
0.7%
Punctuation
ValueCountFrequency (%)
106
95.5%
1
 
0.9%
1
 
0.9%
1
 
0.9%
1
 
0.9%
1
 
0.9%
Modifier Letters
ValueCountFrequency (%)
ʻ1
100.0%

n_killed
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.123063779
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1.123063779
median: 1.123063779
Q3: 1.123063779
95-th percentile: 1.123063779
Maximum: 50
Range: 49
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2292706777
Coefficient of variation (CV): 0.2041475132
Kurtosis: 9576.758969
Mean: 1.123063779
Median Absolute Deviation (MAD): 0
Skewness: 55.55636018
Sum: 269172.5574
Variance: 0.05256504364
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value               Count     Frequency (%)
1.123063779         185835    77.5%
1                   48436     20.2%
2                   4604      1.9%
3                   595       0.2%
4                   139       0.1%
5                   41        < 0.1%
6                   11        < 0.1%
8                   5         < 0.1%
9                   3         < 0.1%
7                   2         < 0.1%
Other values (6)    6         < 0.1%

Value          Count     Frequency (%)
1              48436     20.2%
1.123063779    185835    77.5%
2              4604      1.9%
3              595       0.2%
4              139       0.1%
5              41        < 0.1%
6              11        < 0.1%
7              2         < 0.1%
8              5         < 0.1%
9              3         < 0.1%

Value    Count    Frequency (%)
50       1        < 0.1%
27       1        < 0.1%
17       1        < 0.1%
16       1        < 0.1%
11       1        < 0.1%
10       1        < 0.1%
9        3        < 0.1%
8        5        < 0.1%
7        2        < 0.1%
6        11       < 0.1%

n_injured
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.218252907
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1.218252907
Q3: 1.218252907
95-th percentile: 2
Maximum: 53
Range: 52
Interquartile range (IQR): 0.2182529067

Descriptive statistics

Standard deviation: 0.4183854601
Coefficient of variation (CV): 0.3434307095
Kurtosis: 1205.739971
Mean: 1.218252907
Median Absolute Deviation (MAD): 0
Skewness: 17.27966916
Sum: 291987.2019
Variance: 0.1750463932
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value                Count     Frequency (%)
1.218252907          142487    59.4%
1                    81986     34.2%
2                    11484     4.8%
3                    2513      1.0%
4                    760       0.3%
5                    241       0.1%
6                    91        < 0.1%
7                    51        < 0.1%
8                    19        < 0.1%
9                    12        < 0.1%
Other values (13)    33        < 0.1%

Value          Count     Frequency (%)
1              81986     34.2%
1.218252907    142487    59.4%
2              11484     4.8%
3              2513      1.0%
4              760       0.3%
5              241       0.1%
6              91        < 0.1%
7              51        < 0.1%
8              19        < 0.1%
9              12        < 0.1%

Value    Count    Frequency (%)
53       1        < 0.1%
25       1        < 0.1%
20       1        < 0.1%
19       3        < 0.1%
18       1        < 0.1%
17       2        < 0.1%
16       2        < 0.1%
15       2        < 0.1%
14       3        < 0.1%
13       2        < 0.1%

congressional_district
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 54
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.016083621
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 6
Q3: 10
95-th percentile: 26
Maximum: 53
Range: 52
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.260000743
Coefficient of variation (CV): 1.030428465
Kurtosis: 5.609405356
Mean: 8.016083621
Median Absolute Deviation (MAD): 4
Skewness: 2.162460651
Sum: 1921270.874
Variance: 68.22761227
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
1                    36910    15.4%
2                    26945    11.2%
3                    20621    8.6%
7                    19709    8.2%
4                    18469    7.7%
5                    16512    6.9%
8.016083621          12365    5.2%
9                    9867     4.1%
6                    9388     3.9%
8                    7353     3.1%
Other values (44)    61538    25.7%

Value          Count    Frequency (%)
1              36910    15.4%
2              26945    11.2%
3              20621    8.6%
4              18469    7.7%
5              16512    6.9%
6              9388     3.9%
7              19709    8.2%
8              7353     3.1%
8.016083621    12365    5.2%
9              9867     4.1%

Value    Count    Frequency (%)
53       130      0.1%
52       158      0.1%
51       228      0.1%
50       116      < 0.1%
49       100      < 0.1%
48       119      < 0.1%
47       292      0.1%
46       251      0.1%
45       45       < 0.1%
44       421      0.2%

gun_type
Categorical

HIGH CARDINALITY

Distinct: 2502
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
0::Unknown: 193010
0::Handgun: 13018
0::9mm: 4599
0::Unknown||1::Unknown: 2410
0::22 LR: 2193
Other values (2497): 24447

Length

Max length: 5488
Median length: 10
Mean length: 12.59401194
Min length: 5

Characters and Unicode

Total characters: 3018495
Distinct characters: 44
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1867
Unique (%): 0.8%

Sample

1st row: 0::Unknown
2nd row: 0::Unknown
3rd row: 0::Unknown||1::Unknown
4th row: 0::Unknown
5th row: 0::Handgun||1::Handgun
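
The sample rows show that gun_type (like the participant_* columns) packs several indexed entries into one string, in the form `index::value` joined by `||`. That packing is what inflates the distinct counts behind the high-cardinality alerts. The sketch below splits one such string into a dictionary. It assumes every entry follows the `index::value||index::value` pattern seen in the samples; rows that use other delimiters are not handled, and the function name is hypothetical.

```python
from typing import Dict


def parse_indexed_field(raw: str) -> Dict[int, str]:
    """Split 'idx::value||idx::value' into {idx: value}."""
    result: Dict[int, str] = {}
    for item in raw.split("||"):
        index, sep, value = item.partition("::")
        if not sep:
            continue  # skip fragments that lack the '::' separator
        result[int(index)] = value
    return result


guns = parse_indexed_field("0::Handgun||1::Handgun")
ages = parse_indexed_field("0::25||1::31||2::33||3::34||4::33")
print(guns, ages)
```

Exploding the parsed entries into one row per index would turn these columns into ordinary low-cardinality categoricals.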

Common Values

ValueCountFrequency (%)
0::Unknown193010
80.5%
0::Handgun13018
 
5.4%
0::9mm4599
 
1.9%
0::Unknown||1::Unknown2410
 
1.0%
0::22 LR2193
 
0.9%
0::Shotgun2151
 
0.9%
0::40 SW1947
 
0.8%
0::380 Auto1844
 
0.8%
0::45 Auto1537
 
0.6%
0::Rifle1522
 
0.6%
Other values (2492)15446
 
6.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0::unknown193010
74.1%
0::handgun13018
 
5.0%
auto4773
 
1.8%
0::9mm4599
 
1.8%
0::222602
 
1.0%
lr2558
 
1.0%
0::unknown||1::unknown2410
 
0.9%
0::402240
 
0.9%
sw2217
 
0.9%
0::shotgun2151
 
0.8%
Other values (2722)30842
 
11.8%

Most occurring characters

ValueCountFrequency (%)
n746412
24.7%
:583401
19.3%
0248967
 
8.2%
o240744
 
8.0%
U230605
 
7.6%
k230605
 
7.6%
w230605
 
7.6%
|104403
 
3.5%
u36648
 
1.2%
g33261
 
1.1%
Other values (34)332844
11.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1632333
54.1%
Other Punctuation584341
 
19.4%
Decimal Number374616
 
12.4%
Uppercase Letter294206
 
9.7%
Math Symbol104403
 
3.5%
Space Separator20743
 
0.7%
Dash Punctuation2747
 
0.1%
Open Punctuation2553
 
0.1%
Close Punctuation2553
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n746412
45.7%
o240744
 
14.7%
k230605
 
14.1%
w230605
 
14.1%
u36648
 
2.2%
g33261
 
2.0%
a27531
 
1.7%
d25050
 
1.5%
m14635
 
0.9%
t11204
 
0.7%
Other values (7)35638
 
2.2%
Decimal Number
ValueCountFrequency (%)
0248967
66.5%
128495
 
7.6%
226517
 
7.1%
316236
 
4.3%
412728
 
3.4%
510539
 
2.8%
910164
 
2.7%
88254
 
2.2%
76965
 
1.9%
65751
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
U230605
78.4%
H25050
 
8.5%
R11855
 
4.0%
S8915
 
3.0%
A8421
 
2.9%
L3358
 
1.1%
W2975
 
1.0%
O1065
 
0.4%
M1022
 
0.3%
K940
 
0.3%
Other Punctuation
ValueCountFrequency (%)
:583401
99.8%
.940
 
0.2%
Math Symbol
ValueCountFrequency (%)
|104403
100.0%
Space Separator
ValueCountFrequency (%)
20743
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2747
100.0%
Open Punctuation
ValueCountFrequency (%)
[2553
100.0%
Close Punctuation
ValueCountFrequency (%)
]2553
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1926539
63.8%
Common1091956
36.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n746412
38.7%
o240744
 
12.5%
U230605
 
12.0%
k230605
 
12.0%
w230605
 
12.0%
u36648
 
1.9%
g33261
 
1.7%
a27531
 
1.4%
d25050
 
1.3%
H25050
 
1.3%
Other values (17)100028
 
5.2%
Common
ValueCountFrequency (%)
:583401
53.4%
0248967
22.8%
|104403
 
9.6%
128495
 
2.6%
226517
 
2.4%
20743
 
1.9%
316236
 
1.5%
412728
 
1.2%
510539
 
1.0%
910164
 
0.9%
Other values (7)29763
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII3018495
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n746412
24.7%
:583401
19.3%
0248967
 
8.2%
o240744
 
8.0%
U230605
 
7.6%
k230605
 
7.6%
w230605
 
7.6%
|104403
 
3.5%
u36648
 
1.2%
g33261
 
1.1%
Other values (34)332844
11.0%

latitude
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 101241
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.30542073
Minimum: 0
Maximum: 71.3368
Zeros: 7923
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 26.66574
Q1: 33.5778
median: 38.1818
Q3: 41.2851
95-th percentile: 43.9448
Maximum: 71.3368
Range: 71.3368
Interquartile range (IQR): 7.7073

Descriptive statistics

Standard deviation: 8.397390649
Coefficient of variation (CV): 0.2312985356
Kurtosis: 9.282299171
Mean: 36.30542073
Median Absolute Deviation (MAD): 3.6093
Skewness: -2.462578632
Sum: 8701574.325
Variance: 70.51616971
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
0                        7923      3.3%
33.6356                  253       0.1%
39.294                   244       0.1%
29.9872                  170       0.1%
33.4347                  161       0.1%
32.8982                  160       0.1%
38.9075                  142       0.1%
36.1334                  112       < 0.1%
29.9551                  109       < 0.1%
28.436                   109       < 0.1%
Other values (101231)    230294    96.1%

Value      Count    Frequency (%)
0          7923     3.3%
19.1114    1        < 0.1%
19.1127    1        < 0.1%
19.2       1        < 0.1%
19.2017    1        < 0.1%
19.4243    1        < 0.1%
19.4331    1        < 0.1%
19.4475    1        < 0.1%
19.4554    1        < 0.1%
19.4578    1        < 0.1%

Value      Count    Frequency (%)
71.3368    1        < 0.1%
71.3005    1        < 0.1%
71.3001    1        < 0.1%
71.3       1        < 0.1%
71.2997    1        < 0.1%
71.2921    1        < 0.1%
71.2906    1        < 0.1%
70.6698    1        < 0.1%
70.1981    1        < 0.1%
67.5533    1        < 0.1%

longitude
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 101241
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.30542073
Minimum: 0
Maximum: 71.3368
Zeros: 7923
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 26.66574
Q1: 33.5778
median: 38.1818
Q3: 41.2851
95-th percentile: 43.9448
Maximum: 71.3368
Range: 71.3368
Interquartile range (IQR): 7.7073

Descriptive statistics

Standard deviation: 8.397390649
Coefficient of variation (CV): 0.2312985356
Kurtosis: 9.282299171
Mean: 36.30542073
Median Absolute Deviation (MAD): 3.6093
Skewness: -2.462578632
Sum: 8701574.325
Variance: 70.51616971
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
0                        7923      3.3%
33.6356                  253       0.1%
39.294                   244       0.1%
29.9872                  170       0.1%
33.4347                  161       0.1%
32.8982                  160       0.1%
38.9075                  142       0.1%
36.1334                  112       < 0.1%
29.9551                  109       < 0.1%
28.436                   109       < 0.1%
Other values (101231)    230294    96.1%

Value      Count    Frequency (%)
0          7923     3.3%
19.1114    1        < 0.1%
19.1127    1        < 0.1%
19.2       1        < 0.1%
19.2017    1        < 0.1%
19.4243    1        < 0.1%
19.4331    1        < 0.1%
19.4475    1        < 0.1%
19.4554    1        < 0.1%
19.4578    1        < 0.1%

Value      Count    Frequency (%)
71.3368    1        < 0.1%
71.3005    1        < 0.1%
71.3001    1        < 0.1%
71.3       1        < 0.1%
71.2997    1        < 0.1%
71.2921    1        < 0.1%
71.2906    1        < 0.1%
70.6698    1        < 0.1%
70.1981    1        < 0.1%
67.5533    1        < 0.1%

n_guns_involved
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 107
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.37244163
Minimum: 1
Maximum: 400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1.37244163
95-th percentile: 2
Maximum: 400
Range: 399
Interquartile range (IQR): 0.3724416299

Descriptive statistics

Standard deviation: 3.578322166
Coefficient of variation (CV): 2.607267288
Kurtosis: 6111.509851
Mean: 1.37244163
Median Absolute Deviation (MAD): 0
Skewness: 67.45361155
Sum: 328942.6925
Variance: 12.80438952
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
1                    127548    53.2%
1.37244163           99451     41.5%
2                    7477      3.1%
3                    2021      0.8%
4                    871       0.4%
5                    435       0.2%
6                    285       0.1%
7                    232       0.1%
8                    137       0.1%
9                    111       < 0.1%
Other values (97)    1109      0.5%

Value         Count     Frequency (%)
1             127548    53.2%
1.37244163    99451     41.5%
2             7477      3.1%
3             2021      0.8%
4             871       0.4%
5             435       0.2%
6             285       0.1%
7             232       0.1%
8             137       0.1%
9             111       < 0.1%

Value    Count    Frequency (%)
400      4        < 0.1%
399      1        < 0.1%
374      1        < 0.1%
346      1        < 0.1%
338      1        < 0.1%
323      1        < 0.1%
300      3        < 0.1%
280      1        < 0.1%
276      1        < 0.1%
268      1        < 0.1%

participant_age
Categorical

HIGH CARDINALITY

Distinct: 18951
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
0::24: 96112
0::23: 3735
0::22: 3733
0::19: 3719
0::21: 3612
Other values (18946): 128766

Length

Max length: 517
Median length: 5
Mean length: 7.142466736
Min length: 4

Characters and Unicode

Total characters: 1711885
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13962
Unique (%): 5.8%

Sample

1st row: 0::20
2nd row: 0::20
3rd row: 0::25||1::31||2::33||3::34||4::33
4th row: 0::29||1::33||2::56||3::33
5th row: 0::18||1::46||2::14||3::47

Common Values

ValueCountFrequency (%)
0::24                   96112     40.1%
0::23                   3735      1.6%
0::22                   3733      1.6%
0::19                   3719      1.6%
0::21                   3612      1.5%
0::18                   3536      1.5%
0::20                   3535      1.5%
0::25                   3500      1.5%
0::26                   3277      1.4%
0::27                   3110      1.3%
Other values (18941)    111808    46.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0::2496112
40.1%
0::233735
 
1.6%
0::223733
 
1.6%
0::193719
 
1.6%
0::213612
 
1.5%
0::183536
 
1.5%
0::203535
 
1.5%
0::253500
 
1.5%
0::263277
 
1.4%
0::273110
 
1.3%
Other values (18941)111808
46.6%

Most occurring characters

ValueCountFrequency (%)
:624586
36.5%
0247809
 
14.5%
2221435
 
12.9%
|147354
 
8.6%
4136589
 
8.0%
1124641
 
7.3%
371904
 
4.2%
534462
 
2.0%
627289
 
1.6%
825757
 
1.5%
Other values (3)50059
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number939886
54.9%
Other Punctuation624645
36.5%
Math Symbol147354
 
8.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0247809
26.4%
2221435
23.6%
4136589
14.5%
1124641
13.3%
371904
 
7.7%
534462
 
3.7%
627289
 
2.9%
825757
 
2.7%
725657
 
2.7%
924343
 
2.6%
Other Punctuation
ValueCountFrequency (%)
:624586
> 99.9%
.59
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
|147354
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1711885
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
:624586
36.5%
0247809
 
14.5%
2221435
 
12.9%
|147354
 
8.6%
4136589
 
8.0%
1124641
 
7.3%
371904
 
4.2%
534462
 
2.0%
627289
 
1.6%
825757
 
1.5%
Other values (3)50059
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1711885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
:624586
36.5%
0247809
 
14.5%
2221435
 
12.9%
|147354
 
8.6%
4136589
 
8.0%
1124641
 
7.3%
371904
 
4.2%
534462
 
2.0%
627289
 
1.6%
825757
 
1.5%
Other values (3)50059
 
2.9%

participant_age_group
Categorical

HIGH CARDINALITY

Distinct: 898
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB
0::Adult 18+: 136790
0::Adult 18+||1::Adult 18+: 49273
0::Adult 18+||1::Adult 18+||2::Adult 18+: 13893
0::Teen 12-17: 7392
0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+: 4975
Other values (893): 27354

Length

Max length: 1536
Median length: 12
Mean length: 20.2124359
Min length: 11

Characters and Unicode

Total characters: 4844456
Distinct characters: 26
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 488
Unique (%): 0.2%

Sample

1st row: 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+
2nd row: 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+
3rd row: 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+
4th row: 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+
5th row: 0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::Adult 18+

Common Values

ValueCountFrequency (%)
0::Adult 18+                                                            136790    57.1%
0::Adult 18+||1::Adult 18+                                              49273     20.6%
0::Adult 18+||1::Adult 18+||2::Adult 18+                                13893     5.8%
0::Teen 12-17                                                           7392      3.1%
0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+                  4975      2.1%
1::Adult 18+                                                            3916      1.6%
0::Adult 18+||1::Teen 12-17                                             1962      0.8%
0::Teen 12-17||1::Adult 18+                                             1914      0.8%
0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+    1736      0.7%
0::Teen 12-17||1::Teen 12-17                                            1673      0.7%
Other values (888)                                                      16153     6.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
18223285
36.1%
0::adult214544
34.7%
18+||1::adult73182
 
11.8%
18+||2::adult25269
 
4.1%
12-1714623
 
2.4%
0::teen13169
 
2.1%
18+||3::adult9699
 
1.6%
1::adult5320
 
0.9%
18+||4::adult3776
 
0.6%
18+||1::teen3415
 
0.6%
Other values (307)32132
 
5.2%

Most occurring characters

ValueCountFrequency (%)
:751137
15.5%
1501343
10.3%
378737
7.8%
d353122
7.3%
l353122
7.3%
8348888
7.2%
A348554
7.2%
u348554
7.2%
t348554
7.2%
+348554
7.2%
Other values (16)763891
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1489333
30.7%
Decimal Number1192814
24.6%
Other Punctuation751137
15.5%
Math Symbol623515
12.9%
Space Separator378737
 
7.8%
Uppercase Letter378737
 
7.8%
Dash Punctuation30183
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1501343
42.0%
8348888
29.2%
0238049
20.0%
257349
 
4.8%
726175
 
2.2%
312574
 
1.1%
45075
 
0.4%
52124
 
0.2%
61007
 
0.1%
9230
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
d353122
23.7%
l353122
23.7%
u348554
23.4%
t348554
23.4%
e51230
 
3.4%
n25615
 
1.7%
h4568
 
0.3%
i4568
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
A348554
92.0%
T25615
 
6.8%
C4568
 
1.2%
Math Symbol
ValueCountFrequency (%)
+348554
55.9%
|274961
44.1%
Other Punctuation
ValueCountFrequency (%)
:751137
100.0%
Space Separator
ValueCountFrequency (%)
378737
100.0%
Dash Punctuation
ValueCountFrequency (%)
-30183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2976386
61.4%
Latin1868070
38.6%

Most frequent character per script

Common
ValueCountFrequency (%)
:751137
25.2%
1501343
16.8%
378737
12.7%
8348888
11.7%
+348554
11.7%
|274961
 
9.2%
0238049
 
8.0%
257349
 
1.9%
-30183
 
1.0%
726175
 
0.9%
Other values (5)21010
 
0.7%
Latin
ValueCountFrequency (%)
d353122
18.9%
l353122
18.9%
A348554
18.7%
u348554
18.7%
t348554
18.7%
e51230
 
2.7%
T25615
 
1.4%
n25615
 
1.4%
C4568
 
0.2%
h4568
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII4844456
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
:751137
15.5%
1501343
10.3%
378737
7.8%
d353122
7.3%
l353122
7.3%
8348888
7.2%
A348554
7.2%
u348554
7.2%
t348554
7.2%
+348554
7.2%
Other values (16)763891
15.8%

participant_gender
Categorical

HIGH CARDINALITY

Distinct: 873
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0::Male | 129858
0::Male||1::Male | 43530
0::Male||1::Male||2::Male | 12383
0::Female||1::Male | 10602
0::Female | 7791
Other values (868) | 35513

Length

Max length: 803
Median length: 7
Mean length: 12.92115222
Min length: 6

Characters and Unicode

Total characters: 3096903
Distinct characters: 21
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 486
Unique (%): 0.2%

Sample

1st row: 0::Male||1::Male||3::Male||4::Female
2nd row: 0::Male
3rd row: 0::Male||1::Male||2::Male||3::Male||4::Male
4th row: 0::Female||1::Male||2::Male||3::Male
5th row: 0::Female||1::Male||2::Male||3::Female

Common Values

Value | Count | Frequency (%)
0::Male | 129858 | 54.2%
0::Male||1::Male | 43530 | 18.2%
0::Male||1::Male||2::Male | 12383 | 5.2%
0::Female||1::Male | 10602 | 4.4%
0::Female | 7791 | 3.3%
0::Male||1::Female | 5130 | 2.1%
0::Male||1::Male||2::Male||3::Male | 4333 | 1.8%
1::Male | 4168 | 1.7%
0::Female||1::Male||2::Male | 2062 | 0.9%
0::Male||1::Female||2::Male | 1819 | 0.8%
Other values (863) | 18001 | 7.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0::male129859
54.2%
0::male||1::male43530
 
18.2%
0::male||1::male||2::male12383
 
5.2%
0::female||1::male10602
 
4.4%
0::female7791
 
3.3%
0::male||1::female5130
 
2.1%
0::male||1::male||2::male||3::male4333
 
1.8%
1::male4168
 
1.7%
0::female||1::male||2::male2062
 
0.9%
0::male||1::female||2::male1819
 
0.8%
Other values (863)18001
 
7.5%

Most occurring characters

ValueCountFrequency (%)
:771466
24.9%
e431799
13.9%
a388626
12.5%
l388626
12.5%
M345453
11.2%
|295226
 
9.5%
0232391
 
7.5%
199390
 
3.2%
m43173
 
1.4%
F43172
 
1.4%
Other values (11)57581
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1252225
40.4%
Other Punctuation771467
24.9%
Decimal Number389359
 
12.6%
Uppercase Letter388625
 
12.5%
Math Symbol295226
 
9.5%
Space Separator1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0232391
59.7%
199390
25.5%
234472
 
8.9%
313564
 
3.5%
45322
 
1.4%
52187
 
0.6%
6991
 
0.3%
7529
 
0.1%
8306
 
0.1%
9207
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
e431799
34.5%
a388626
31.0%
l388626
31.0%
m43173
 
3.4%
f1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
:771466
> 99.9%
,1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
M345453
88.9%
F43172
 
11.1%
Math Symbol
ValueCountFrequency (%)
|295226
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1640850
53.0%
Common1456053
47.0%

Most frequent character per script

Common
ValueCountFrequency (%)
:771466
53.0%
|295226
 
20.3%
0232391
 
16.0%
199390
 
6.8%
234472
 
2.4%
313564
 
0.9%
45322
 
0.4%
52187
 
0.2%
6991
 
0.1%
7529
 
< 0.1%
Other values (4)515
 
< 0.1%
Latin
ValueCountFrequency (%)
e431799
26.3%
a388626
23.7%
l388626
23.7%
M345453
21.1%
m43173
 
2.6%
F43172
 
2.6%
f1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII3096903
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
:771466
24.9%
e431799
13.9%
a388626
12.5%
l388626
12.5%
M345453
11.2%
|295226
 
9.5%
0232391
 
7.5%
199390
 
3.2%
m43173
 
1.4%
F43172
 
1.4%
Other values (11)57581
 
1.9%

participant_relationship
Categorical

HIGH CARDINALITY

Distinct: 284
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
1::Significant others - current or former | 226554
1::Family | 2573
1::Armed Robbery | 1874
1::Armed Robbery||2::Armed Robbery | 1100
1::Friends | 883
Other values (279) | 6693

Length

Max length: 366
Median length: 41
Mean length: 40.12421718
Min length: 8

Characters and Unicode

Total characters: 9616852
Distinct characters: 51
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 124
Unique (%): 0.1%

Sample

1st row: 1::Significant others - current or former
2nd row: 1::Significant others - current or former
3rd row: 1::Significant others - current or former
4th row: 1::Significant others - current or former
5th row: 3::Family

Common Values

Value | Count | Frequency (%)
1::Significant others - current or former | 226554 | 94.5%
1::Family | 2573 | 1.1%
1::Armed Robbery | 1874 | 0.8%
1::Armed Robbery||2::Armed Robbery | 1100 | 0.5%
1::Friends | 883 | 0.4%
1::Aquaintance | 719 | 0.3%
0::Significant others - current or former | 483 | 0.2%
1::Armed Robbery||2::Armed Robbery||3::Armed Robbery | 410 | 0.2%
1::Neighbor | 389 | 0.2%
0::Armed Robbery | 370 | 0.2%
Other values (274) | 4322 | 1.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
229345
16.4%
current227374
16.3%
or227374
16.3%
others227374
16.3%
former227359
16.3%
1::significant226561
16.2%
robbery4704
 
0.3%
1::armed3521
 
0.3%
1::family2573
 
0.2%
perp1912
 
0.1%
Other values (216)20167
 
1.4%

Most occurring characters

ValueCountFrequency (%)
r1384241
14.4%
1158587
12.0%
e706271
 
7.3%
o699669
 
7.3%
i694432
 
7.2%
n692253
 
7.2%
t686513
 
7.1%
:488860
 
5.1%
c457739
 
4.8%
f454748
 
4.7%
Other values (41)2193539
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7222431
75.1%
Space Separator1158587
 
12.0%
Other Punctuation488860
 
5.1%
Uppercase Letter263467
 
2.7%
Decimal Number244477
 
2.5%
Dash Punctuation229486
 
2.4%
Math Symbol9544
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r1384241
19.2%
e706271
9.8%
o699669
9.7%
i694432
9.6%
n692253
9.6%
t686513
9.5%
c457739
 
6.3%
f454748
 
6.3%
m242830
 
3.4%
a236160
 
3.3%
Other values (13)967575
13.4%
Uppercase Letter
ValueCountFrequency (%)
S227374
86.3%
A8989
 
3.4%
R8028
 
3.0%
F4650
 
1.8%
N2101
 
0.8%
V1912
 
0.7%
K1912
 
0.7%
P1912
 
0.7%
I1899
 
0.7%
H1899
 
0.7%
Other values (4)2791
 
1.1%
Decimal Number
ValueCountFrequency (%)
1236169
96.6%
23768
 
1.5%
02291
 
0.9%
31432
 
0.6%
4520
 
0.2%
5176
 
0.1%
663
 
< 0.1%
729
 
< 0.1%
818
 
< 0.1%
911
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1158587
100.0%
Other Punctuation
ValueCountFrequency (%)
:488860
100.0%
Dash Punctuation
ValueCountFrequency (%)
-229486
100.0%
Math Symbol
ValueCountFrequency (%)
|9544
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7485898
77.8%
Common2130954
 
22.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r1384241
18.5%
e706271
9.4%
o699669
9.3%
i694432
9.3%
n692253
9.2%
t686513
9.2%
c457739
 
6.1%
f454748
 
6.1%
m242830
 
3.2%
a236160
 
3.2%
Other values (27)1231042
16.4%
Common
ValueCountFrequency (%)
1158587
54.4%
:488860
22.9%
1236169
 
11.1%
-229486
 
10.8%
|9544
 
0.4%
23768
 
0.2%
02291
 
0.1%
31432
 
0.1%
4520
 
< 0.1%
5176
 
< 0.1%
Other values (4)121
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII9616852
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r1384241
14.4%
1158587
12.0%
e706271
 
7.3%
o699669
 
7.3%
i694432
 
7.2%
n692253
 
7.2%
t686513
 
7.1%
:488860
 
5.1%
c457739
 
4.8%
f454748
 
4.7%
Other values (41)2193539
22.8%

participant_status
Categorical

HIGH CARDINALITY

Distinct: 2150
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0::Injured | 69919
0::Unharmed, Arrested | 25746
0::Killed | 21216
0::Injured||1::Unharmed | 12680
0::Unharmed | 11048
Other values (2145) | 99068

Length

Max length: 1500
Median length: 1066
Mean length: 22.64605281
Min length: 8

Characters and Unicode

Total characters: 5427738
Distinct characters: 31
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1181
Unique (%): 0.5%

Sample

1st row: 0::Arrested||1::Injured||2::Injured||3::Injured||4::Injured
2nd row: 0::Killed||1::Injured||2::Injured||3::Injured
3rd row: 0::Injured, Unharmed, Arrested||1::Unharmed, Arrested||2::Killed||3::Injured||4::Injured
4th row: 0::Killed||1::Killed||2::Killed||3::Killed
5th row: 0::Injured||1::Injured||2::Killed||3::Killed

Common Values

Value | Count | Frequency (%)
0::Injured | 69919 | 29.2%
0::Unharmed, Arrested | 25746 | 10.7%
0::Killed | 21216 | 8.9%
0::Injured||1::Unharmed | 12680 | 5.3%
0::Unharmed | 11048 | 4.6%
0::Killed||1::Unharmed, Arrested | 7799 | 3.3%
0::Injured||1::Unharmed, Arrested | 7488 | 3.1%
0::Unharmed||1::Unharmed | 7042 | 2.9%
0::Injured||1::Injured | 6167 | 2.6%
0::Unharmed, Arrested||1::Unharmed, Arrested | 4658 | 1.9%
Other values (2140) | 65914 | 27.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0::injured71721
21.8%
arrested62404
19.0%
0::unharmed46059
14.0%
0::injured||1::unharmed22189
 
6.7%
0::killed21265
 
6.5%
0::killed||1::unharmed14657
 
4.5%
0::unharmed||1::unharmed9826
 
3.0%
arrested||1::unharmed8755
 
2.7%
arrested||2::unharmed7947
 
2.4%
0::injured||1::injured6473
 
2.0%
Other values (1177)57636
17.5%

Most occurring characters

ValueCountFrequency (%)
:807299
14.9%
e595548
 
11.0%
r535070
 
9.9%
d496215
 
9.1%
n336404
 
6.2%
|331209
 
6.1%
0236358
 
4.4%
U190383
 
3.5%
h190383
 
3.5%
a190383
 
3.5%
Other values (21)1518486
28.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3206528
59.1%
Other Punctuation896554
 
16.5%
Uppercase Letter496215
 
9.1%
Decimal Number407977
 
7.5%
Math Symbol331209
 
6.1%
Space Separator89255
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e595548
18.6%
r535070
16.7%
d496215
15.5%
n336404
10.5%
h190383
 
5.9%
a190383
 
5.9%
m190383
 
5.9%
u146021
 
4.6%
j146021
 
4.6%
l120956
 
3.8%
Other values (3)259144
8.1%
Decimal Number
ValueCountFrequency (%)
0236358
57.9%
1106989
26.2%
238041
 
9.3%
315281
 
3.7%
46156
 
1.5%
52578
 
0.6%
61220
 
0.3%
7673
 
0.2%
8397
 
0.1%
9284
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
U190383
38.4%
I146021
29.4%
A99333
20.0%
K60478
 
12.2%
Other Punctuation
ValueCountFrequency (%)
:807299
90.0%
,89255
 
10.0%
Math Symbol
ValueCountFrequency (%)
|331209
100.0%
Space Separator
ValueCountFrequency (%)
89255
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3702743
68.2%
Common1724995
31.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e595548
16.1%
r535070
14.5%
d496215
13.4%
n336404
9.1%
U190383
 
5.1%
h190383
 
5.1%
a190383
 
5.1%
m190383
 
5.1%
u146021
 
3.9%
j146021
 
3.9%
Other values (7)685932
18.5%
Common
ValueCountFrequency (%)
:807299
46.8%
|331209
19.2%
0236358
 
13.7%
1106989
 
6.2%
89255
 
5.2%
,89255
 
5.2%
238041
 
2.2%
315281
 
0.9%
46156
 
0.4%
52578
 
0.1%
Other values (4)2574
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII5427738
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
:807299
14.9%
e595548
 
11.0%
r535070
 
9.9%
d496215
 
9.1%
n336404
 
6.2%
|331209
 
6.1%
0236358
 
4.4%
U190383
 
3.5%
h190383
 
3.5%
a190383
 
3.5%
Other values (21)1518486
28.0%

participant_type
Categorical

HIGH CARDINALITY

Distinct: 259
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0::Victim | 83427
0::Victim||1::Subject-Suspect | 50579
0::Subject-Suspect | 44914
0::Victim||1::Subject-Suspect||2::Subject-Suspect | 10941
0::Victim||1::Victim | 9033
Other values (254) | 40783

Length

Max length: 1311
Median length: 1236
Mean length: 24.59072001
Min length: 8

Characters and Unicode

Total characters: 5893830
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): < 0.1%

Sample

1st row: 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect
2nd row: 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect
3rd row: 0::Subject-Suspect||1::Subject-Suspect||2::Victim||3::Victim||4::Victim
4th row: 0::Victim||1::Victim||2::Victim||3::Subject-Suspect
5th row: 0::Victim||1::Victim||2::Victim||3::Subject-Suspect

Common Values

Value | Count | Frequency (%)
0::Victim | 83427 | 34.8%
0::Victim||1::Subject-Suspect | 50579 | 21.1%
0::Subject-Suspect | 44914 | 18.7%
0::Victim||1::Subject-Suspect||2::Subject-Suspect | 10941 | 4.6%
0::Victim||1::Victim | 9033 | 3.8%
0::Subject-Suspect||1::Subject-Suspect | 8922 | 3.7%
0::Victim||1::Victim||2::Subject-Suspect | 6552 | 2.7%
0::Victim||1::Subject-Suspect||2::Subject-Suspect||3::Subject-Suspect | 3720 | 1.6%
0::Subject-Suspect||1::Subject-Suspect||2::Subject-Suspect | 3040 | 1.3%
0::Victim||1::Victim||2::Subject-Suspect||3::Subject-Suspect | 2107 | 0.9%
Other values (249) | 16442 | 6.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0::victim83427
34.8%
0::victim||1::subject-suspect50579
21.1%
0::subject-suspect44914
18.7%
0::victim||1::subject-suspect||2::subject-suspect10941
 
4.6%
0::victim||1::victim9033
 
3.8%
0::subject-suspect||1::subject-suspect8922
 
3.7%
0::victim||1::victim||2::subject-suspect6552
 
2.7%
0::victim||1::subject-suspect||2::subject-suspect||3::subject-suspect3720
 
1.6%
0::subject-suspect||1::subject-suspect||2::subject-suspect3040
 
1.3%
0::victim||1::victim||2::subject-suspect||3::subject-suspect2107
 
0.9%
Other values (249)16442
 
6.9%

Most occurring characters

ValueCountFrequency (%)
:827562
14.0%
c616449
10.5%
t616449
10.5%
i435846
 
7.4%
S398526
 
6.8%
u398526
 
6.8%
e398526
 
6.8%
|351532
 
6.0%
0239883
 
4.1%
V217923
 
3.7%
Other values (15)1392608
23.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3480771
59.1%
Other Punctuation827562
 
14.0%
Uppercase Letter616449
 
10.5%
Decimal Number418253
 
7.1%
Math Symbol351532
 
6.0%
Dash Punctuation199263
 
3.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c616449
17.7%
t616449
17.7%
i435846
12.5%
u398526
11.4%
e398526
11.4%
m217923
 
6.3%
p199263
 
5.7%
s199263
 
5.7%
j199263
 
5.7%
b199263
 
5.7%
Decimal Number
ValueCountFrequency (%)
0239883
57.4%
1110860
26.5%
239613
 
9.5%
315996
 
3.8%
46478
 
1.5%
52721
 
0.7%
61287
 
0.3%
7702
 
0.2%
8417
 
0.1%
9296
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
S398526
64.6%
V217923
35.4%
Other Punctuation
ValueCountFrequency (%)
:827562
100.0%
Math Symbol
ValueCountFrequency (%)
|351532
100.0%
Dash Punctuation
ValueCountFrequency (%)
-199263
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4097220
69.5%
Common1796610
30.5%

Most frequent character per script

Common
ValueCountFrequency (%)
:827562
46.1%
|351532
19.6%
0239883
 
13.4%
-199263
 
11.1%
1110860
 
6.2%
239613
 
2.2%
315996
 
0.9%
46478
 
0.4%
52721
 
0.2%
61287
 
0.1%
Other values (3)1415
 
0.1%
Latin
ValueCountFrequency (%)
c616449
15.0%
t616449
15.0%
i435846
10.6%
S398526
9.7%
u398526
9.7%
e398526
9.7%
V217923
 
5.3%
m217923
 
5.3%
p199263
 
4.9%
s199263
 
4.9%
Other values (2)398526
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII5893830
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
:827562
14.0%
c616449
10.5%
t616449
10.5%
i435846
 
7.4%
S398526
 
6.8%
u398526
 
6.8%
e398526
 
6.8%
|351532
 
6.0%
0239883
 
4.1%
V217923
 
3.7%
Other values (15)1392608
23.6%

state_house_district
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 276
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.44713173
Minimum: 1
Maximum: 901
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 27
median: 55.44713173
Q3: 77
95-th percentile: 128
Maximum: 901
Range: 900
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 38.49714906
Coefficient of variation (CV): 0.6943037063
Kurtosis: 21.37434483
Mean: 55.44713173
Median Absolute Deviation (MAD): 25.55286827
Skewness: 1.990581042
Sum: 13289402.19
Variance: 1482.030486
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
55.44713173 | 38772 | 16.2%
18 | 3476 | 1.5%
10 | 3411 | 1.4%
32 | 3218 | 1.3%
31 | 3192 | 1.3%
3 | 3064 | 1.3%
2 | 2941 | 1.2%
34 | 2897 | 1.2%
6 | 2870 | 1.2%
13 | 2762 | 1.2%
Other values (266) | 173074 | 72.2%

Minimum values

Value | Count | Frequency (%)
1 | 2056 | 0.9%
2 | 2941 | 1.2%
3 | 3064 | 1.3%
4 | 1928 | 0.8%
5 | 2603 | 1.1%
6 | 2870 | 1.2%
7 | 2059 | 0.9%
8 | 2475 | 1.0%
9 | 2492 | 1.0%
10 | 3411 | 1.4%

Maximum values

Value | Count | Frequency (%)
901 | 1 | < 0.1%
814 | 1 | < 0.1%
813 | 1 | < 0.1%
811 | 1 | < 0.1%
809 | 1 | < 0.1%
808 | 1 | < 0.1%
805 | 1 | < 0.1%
804 | 1 | < 0.1%
801 | 1 | < 0.1%
729 | 2 | < 0.1%

state_senate_district
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 69
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.47711028
Minimum: 1
Maximum: 94
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 10
median: 20.47711028
Q3: 29
95-th percentile: 46
Maximum: 94
Range: 93
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 13.21168148
Coefficient of variation (CV): 0.645192671
Kurtosis: 0.1855189135
Mean: 20.47711028
Median Absolute Deviation (MAD): 9.477110282
Skewness: 0.6863824434
Sum: 4907892.361
Variance: 174.5485275
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20.47711028 | 32335 | 13.5%
5 | 10041 | 4.2%
9 | 7963 | 3.3%
3 | 7837 | 3.3%
4 | 7328 | 3.1%
2 | 6666 | 2.8%
15 | 5835 | 2.4%
6 | 5817 | 2.4%
19 | 5644 | 2.4%
14 | 5641 | 2.4%
Other values (59) | 144570 | 60.3%

Minimum values

Value | Count | Frequency (%)
1 | 5206 | 2.2%
2 | 6666 | 2.8%
3 | 7837 | 3.3%
4 | 7328 | 3.1%
5 | 10041 | 4.2%
6 | 5817 | 2.4%
7 | 4742 | 2.0%
8 | 3962 | 1.7%
9 | 7963 | 3.3%
10 | 4542 | 1.9%

Maximum values

Value | Count | Frequency (%)
94 | 1 | < 0.1%
67 | 69 | < 0.1%
66 | 33 | < 0.1%
65 | 124 | 0.1%
64 | 20 | < 0.1%
63 | 867 | 0.4%
62 | 350 | 0.1%
61 | 177 | 0.1%
60 | 204 | 0.1%
59 | 487 | 0.2%

Interactions

[Figures: 81 pairwise interaction plots of the numeric variables; images not preserved in this text version]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
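
The calculation described above can be sketched in plain Python: rank each variable (ties share the average of their positions), then divide the covariance of the ranks by the product of their standard deviations. The function names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative assumptions, and this is a minimal sketch rather than the routine the report generator uses. Constant inputs are not handled and would divide by zero.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho reaches 1 while r stays below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```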

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
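
A minimal sketch of that formula, using only the standard library, is given below together with a check of the location and scale invariance mentioned above. The function name `pearson_from_cov` and the sample data are illustrative assumptions. The sample (n - 1) convention is used for both the covariance and the standard deviations, so the factor cancels in the ratio.

```python
from statistics import mean, stdev


def pearson_from_cov(x, y):
    """Sample covariance of x and y divided by the product of their sample standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cov / (stdev(x) * stdev(y))


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_from_cov(x, y)

# Shifting and positively rescaling either variable leaves r unchanged.
x_shifted = [2 * a + 3 for a in x]
y_shifted = [0.5 * b - 1 for b in y]
r_shifted = pearson_from_cov(x_shifted, y_shifted)
print(r, r_shifted)
```

Rescaling by a negative factor would flip the sign of r but not its magnitude.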

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
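
The pair-counting definition above can be sketched directly. This is the untied form (often called tau-a): tied pairs count as neither concordant nor discordant, whereas library implementations often apply a tie correction (tau-b) and so may give different values on data with ties. The function name `kendall_tau_a` and the sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with tied pairs counted as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables order the pair the same way
        elif direction < 0:
            discordant += 1  # the variables order the pair in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped 2 and 3).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The double loop is quadratic in the number of observations, which is fine for a sketch but slow on a dataset of this size.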

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | incident_id | date | state | city_or_county | address | n_killed | n_injured | congressional_district | gun_type | latitude | longitude | n_guns_involved | participant_age | participant_age_group | participant_gender | participant_relationship | participant_status | participant_type | state_house_district | state_senate_district
0 | 461105 | 2013-01-01 | Pennsylvania | Mckeesport | 1506 Versailles Avenue and Coursin Street | 1.123064 | 4.000000 | 14.0 | 0::Unknown | 40.3467 | 40.3467 | 1.372442 | 0::20 | 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+ | 0::Male||1::Male||3::Male||4::Female | 1::Significant others - current or former | 0::Arrested||1::Injured||2::Injured||3::Injured||4::Injured | 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect | 55.447132 | 20.47711
1 | 460726 | 2013-01-01 | California | Hawthorne | 13500 block of Cerise Avenue | 1.000000 | 3.000000 | 43.0 | 0::Unknown | 33.909 | 33.909 | 1.372442 | 0::20 | 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Killed||1::Injured||2::Injured||3::Injured | 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect | 62.000000 | 35.00000
2 | 478855 | 2013-01-01 | Ohio | Lorain | 1776 East 28th Street | 1.000000 | 3.000000 | 9.0 | 0::Unknown||1::Unknown | 41.4455 | 41.4455 | 2.000000 | 0::25||1::31||2::33||3::34||4::33 | 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+ | 0::Male||1::Male||2::Male||3::Male||4::Male | 1::Significant others - current or former | 0::Injured, Unharmed, Arrested||1::Unharmed, Arrested||2::Killed||3::Injured||4::Injured | 0::Subject-Suspect||1::Subject-Suspect||2::Victim||3::Victim||4::Victim | 56.000000 | 13.00000
3 | 478925 | 2013-05-01 | Colorado | Aurora | 16000 block of East Ithaca Place | 4.000000 | 1.218253 | 6.0 | 0::Unknown | 39.6518 | 39.6518 | 1.372442 | 0::29||1::33||2::56||3::33 | 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+ | 0::Female||1::Male||2::Male||3::Male | 1::Significant others - current or former | 0::Killed||1::Killed||2::Killed||3::Killed | 0::Victim||1::Victim||2::Victim||3::Subject-Suspect | 40.000000 | 28.00000
4 | 478959 | 2013-07-01 | North Carolina | Greensboro | 307 Mourning Dove Terrace | 2.000000 | 2.000000 | 6.0 | 0::Handgun||1::Handgun | 36.114 | 36.114 | 2.000000 | 0::18||1::46||2::14||3::47 | 0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::Adult 18+ | 0::Female||1::Male||2::Male||3::Female | 3::Family | 0::Injured||1::Injured||2::Killed||3::Killed | 0::Victim||1::Victim||2::Victim||3::Subject-Suspect | 62.000000 | 27.00000
5 | 478948 | 2013-07-01 | Oklahoma | Tulsa | 6000 block of South Owasso | 4.000000 | 1.218253 | 1.0 | 0::Unknown | 36.2405 | 36.2405 | 1.372442 | 0::23||1::23||2::33||3::55 | 0::Adult 18+||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+||5::Adult 18+ | 0::Female||1::Female||2::Female||3::Female||4::Male||5::Male | 1::Significant others - current or former | 0::Killed||1::Killed||2::Killed||3::Killed||4::Unharmed, Arrested||5::Unharmed, Arrested | 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect||5::Subject-Suspect | 72.000000 | 11.00000
6 | 479363 | 2013-01-19 | New Mexico | Albuquerque | 2806 Long Lane | 5.000000 | 1.218253 | 1.0 | 0::22 LR||1::223 Rem [AR-15] | 34.9791 | 34.9791 | 2.000000 | 0::51||1::40||2::9||3::5||4::2||5::15 | 0::Adult 18+||1::Adult 18+||2::Child 0-11||3::Child 0-11||4::Child 0-11||5::Teen 12-17 | 0::Male||1::Female||2::Male||3::Female||4::Female||5::Male | 5::Family | 0::Killed||1::Killed||2::Killed||3::Killed||4::Killed||5::Unharmed, Arrested | 0::Victim||1::Victim||2::Victim||3::Victim||4::Victim||5::Subject-Suspect | 10.000000 | 14.00000
7 | 479374 | 2013-01-21 | Louisiana | New Orleans | LaSalle Street and Martin Luther King Jr. Boulevard | 1.123064 | 5.000000 | 2.0 | 0::Unknown | 29.9435 | 29.9435 | 1.372442 | 0::24 | 0::Adult 18+ | 0::Male||1::Male||2::Male||3::Male||4::Male | 1::Significant others - current or former | 0::Injured||1::Injured||2::Injured||3::Injured||4::Injured | 0::Victim||1::Victim||2::Victim||3::Victim||4::Victim||5::Subject-Suspect | 93.000000 | 5.00000
8 | 479389 | 2013-01-21 | California | Brentwood | 1100 block of Breton Drive | 1.123064 | 4.000000 | 9.0 | 0::Unknown | 37.9656 | 37.9656 | 1.372442 | 0::24 | 0::Teen 12-17||1::Teen 12-17||2::Teen 12-17||4::Adult 18+ | 0::Male||1::Male||2::Male||3::Male||4::Male | 1::Significant others - current or former | 0::Injured||1::Injured||2::Injured||3::Injured||4::Unharmed | 0::Victim||1::Victim||2::Victim||3::Victim||4::Subject-Suspect | 11.000000 | 7.00000
9 | 492151 | 2013-01-23 | Maryland | Baltimore | 1500 block of W. Fayette St. | 1.000000 | 6.000000 | 7.0 | 0::Unknown | 39.2899 | 39.2899 | 1.372442 | 0::15 | 0::Teen 12-17||1::Adult 18+||2::Adult 18+||3::Adult 18+||4::Adult 18+||5::Adult 18+||6::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Killed||1::Injured||2::Injured||3::Injured||4::Injured||5::Injured||6::Injured | 0::Victim||1::Victim||2::Victim||3::Victim||4::Victim||5::Victim||6::Victim | 55.447132 | 44.00000

Last rows

| incident_id | date | state | city_or_county | address | n_killed | n_injured | congressional_district | gun_type | latitude | longitude | n_guns_involved | participant_age | participant_age_group | participant_gender | participant_relationship | participant_status | participant_type | state_house_district | state_senate_district
239667 | 1082234 | 2018-03-31 | Tennessee | Memphis | 2900 block of Wingate | 1.123064 | 1.000000 | 9.000000 | 0::Unknown | 35.2045 | 35.2045 | 1.0 | 0::69||1::35 | 0::Adult 18+||1::Adult 18+ | 0::Male||1::Female | 1::Significant others - current or former | 0::Injured||1::Arrested | 0::Victim||1::Subject-Suspect | 90.000000 | 30.00000
239668 | 1081742 | 2018-03-31 | Michigan | Detroit | I-96 | 1.123064 | 1.000000 | 8.016084 | 0::Unknown | 0.0 | 0.0 | 1.0 | 0::24 | 0::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Injured | 0::Victim | 55.447132 | 20.47711
239669 | 1082990 | 2018-03-31 | Wisconsin | Madison | Hayes Rd | 1.123064 | 1.218253 | 8.016084 | 0::45 Auto | 0.0 | 0.0 | 1.0 | 0::24 | 0::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Injured | 0::Victim | 55.447132 | 20.47711
239670 | 1081752 | 2018-03-31 | Illinois | Chicago | 1 block of N Paulina St | 1.123064 | 1.000000 | 8.016084 | 0::Unknown | 0.0 | 0.0 | 1.0 | 0::36 | 0::Adult 18+||1::Adult 18+ | 0::Male||1::Male | 1::Significant others - current or former | 0::Injured||1::Unharmed | 0::Victim||1::Subject-Suspect | 55.447132 | 20.47711
239671 | 1082061 | 2018-03-31 | Washington | Spokane (Spokane Valley) | 12600 block of N Willow Crest Ln | 1.123064 | 1.218253 | 5.000000 | 0::Unknown | 47.6638 | 47.6638 | 1.0 | 0::48 | 0::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Unharmed, Arrested | 0::Subject-Suspect | 4.000000 | 4.00000
239672 | 1083142 | 2018-03-31 | Louisiana | Rayne | North Riceland Road and Highway 90 | 1.123064 | 1.218253 | 8.016084 | 0::Unknown | 0.0 | 0.0 | 1.0 | 0::25 | 0::Adult 18+ | 0::Female | 1::Significant others - current or former | 0::Unharmed, Arrested | 0::Subject-Suspect | 55.447132 | 20.47711
239673 | 1083139 | 2018-03-31 | Louisiana | Natchitoches | 247 Keyser Ave | 1.000000 | 1.218253 | 4.000000 | 0::Unknown | 31.7537 | 31.7537 | 1.0 | 1::21 | 0::Adult 18+||1::Adult 18+ | 0::Male||1::Male | 1::Significant others - current or former | 0::Killed||1::Unharmed, Arrested | 0::Victim||1::Subject-Suspect | 23.000000 | 31.00000
239674 | 1083151 | 2018-03-31 | Louisiana | Gretna | 1300 block of Cook Street | 1.123064 | 1.000000 | 2.000000 | 0::Unknown | 29.9239 | 29.9239 | 1.0 | 0::21 | 0::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Injured | 0::Victim | 85.000000 | 7.00000
239675 | 1082514 | 2018-03-31 | Texas | Houston | 12630 Ashford Point Dr | 1.000000 | 1.218253 | 9.000000 | 0::Unknown | 29.7201 | 29.7201 | 1.0 | 0::42 | 0::Adult 18+ | 0::Male | 1::Significant others - current or former | 0::Killed | 0::Victim | 149.000000 | 17.00000
239676 | 1081940 | 2018-03-31 | Maine | Norridgewock | 434 Skowhegan Rd | 2.000000 | 1.218253 | 2.000000 | 0::Handgun||1::Shotgun | 44.7293 | 44.7293 | 2.0 | 0::58||1::62 | 0::Adult 18+||1::Adult 18+ | 0::Female||1::Male | 1::Significant others - current or former | 0::Killed||1::Killed | 0::Victim||1::Subject-Suspect | 111.000000 | 3.00000